Long noncoding RNAs are rarely translated in two human cell lines.

Authors

  • Balázs Bánfai
  • Hui Jia
  • Jainab Khatun
  • Emily Wood
  • Brian Risk
  • William E Gundling
  • Anshul Kundaje
  • Harsha P Gunawardena
  • Yanbao Yu
  • Ling Xie
  • Krzysztof Krajewski
  • Brian D Strahl
  • Xian Chen
  • Peter Bickel
  • Morgan C Giddings
  • James B Brown
  • Leonard Lipovich
Abstract

Data from the Encyclopedia of DNA Elements (ENCODE) project show over 9640 human genome loci classified as long noncoding RNAs (lncRNAs), yet only ~100 have been deeply characterized to determine their role in the cell. To measure the protein-coding output from these RNAs, we jointly analyzed two recent data sets produced in the ENCODE project: tandem mass spectrometry (MS/MS) data mapping expressed peptides to their encoding genomic loci, and RNA-seq data generated by ENCODE in long polyA+ and polyA- fractions in the cell lines K562 and GM12878. We used the machine-learning algorithm RuleFit3 to regress the peptide data against RNA expression data. The most important covariate for predicting translation was, surprisingly, the Cytosol polyA- fraction in both cell lines. LncRNAs are ~13-fold less likely to produce detectable peptides than similar mRNAs, indicating that ~92% of GENCODE v7 lncRNAs are not translated in these two ENCODE cell lines. Intersecting 9640 lncRNA loci with 79,333 peptides yielded 85 unique peptides matching 69 lncRNAs. Most cases were due to a coding transcript misannotated as lncRNA. Two exceptions were an unprocessed pseudogene and a bona fide lncRNA gene, both with open reading frames (ORFs) compromised by upstream stop codons. All potentially translatable lncRNA ORFs had only a single peptide match, indicating low protein abundance and/or false-positive peptide matches. We conclude that with very few exceptions, ribosomes are able to distinguish coding from noncoding transcripts and, hence, that ectopic translation and cryptic mRNAs are rare in the human lncRNAome.
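
As a rough illustration of the intersection step mentioned in the abstract (matching genome-mapped peptides against lncRNA loci), the sketch below counts unique peptides and matched loci by simple interval overlap. The function name, the tuple layouts, the half-open coordinate convention, and the toy coordinates are all hypothetical and chosen only for illustration. The abstract does not describe the authors' actual pipeline at this level of detail.

```python
from collections import defaultdict


def intersect_peptides_with_loci(peptides, loci):
    """Map each locus ID to the set of peptide sequences overlapping it.

    peptides: iterable of (sequence, chrom, start, end), half-open [start, end)
    loci:     iterable of (locus_id, chrom, start, end), half-open [start, end)
    Both layouts are assumptions made for this sketch.
    """
    # Group loci by chromosome so peptides are only compared within a chromosome.
    loci_by_chrom = defaultdict(list)
    for locus_id, chrom, start, end in loci:
        loci_by_chrom[chrom].append((start, end, locus_id))

    hits = defaultdict(set)
    for seq, chrom, p_start, p_end in peptides:
        for l_start, l_end, locus_id in loci_by_chrom.get(chrom, ()):
            # Two half-open intervals overlap if each starts before the other ends.
            if p_start < l_end and l_start < p_end:
                hits[locus_id].add(seq)
    return dict(hits)


# Toy, made-up inputs (not data from the study).
toy_loci = [
    ("lnc_A", "chr1", 100, 500),
    ("lnc_B", "chr1", 1000, 1500),
    ("lnc_C", "chr2", 100, 500),
]
toy_peptides = [
    ("PEPTIDEK", "chr1", 120, 144),
    ("SAMPLER", "chr1", 490, 511),
    ("OTHERK", "chr2", 600, 618),
    ("PEPTIDEK", "chr1", 1200, 1224),
]

toy_hits = intersect_peptides_with_loci(toy_peptides, toy_loci)
matched_loci = len(toy_hits)
unique_peptides = len(set().union(*toy_hits.values())) if toy_hits else 0
print(f"{unique_peptides} unique peptides matching {matched_loci} loci")
```

The nested loop is quadratic per chromosome, which is fine for a toy example; at the scale reported in the abstract (9640 loci, 79,333 peptides) a sorted sweep or an interval tree would be the more usual choice.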

Similar resources

Comparative Analysis of Human Protein-Coding and Noncoding RNAs between Brain and 10 Mixed Cell Lines by RNA-Seq

In their expression process, different genes can generate diverse functional products, including various protein-coding or noncoding RNAs. Here, we investigated the protein-coding capacities and the expression levels of their isoforms for human known genes, the conservation and disease association of long noncoding RNAs (ncRNAs) with two transcriptome sequencing datasets from human brain tissue...

Evaluation of Long Non-Coding RNAs: HNF1A-AS1 and MVIH Expressions and Their Clinical Significance in Human Gastric Cancer

Gastric cancer is one of the most common cancers in the world. Late diagnosis is the main cause of the high rate of treatment failure and death among patients with gastric cancer; therefore, identifying the molecular basis of cancer initiation and metastasis is critical for developing efficient methods for early diagnosis and therapy. Long non-coding RNAs (lncRNAs) are the largest group of no...

Noncoding RNA in the transcriptional landscape of human neural progenitor cell differentiation

Increasing evidence suggests that noncoding RNAs play key roles in cellular processes, particularly in the brain. The present study used RNA sequencing to identify the transcriptional landscape of two human neural progenitor cell lines, SK-N-SH and ReNcell CX, as they differentiate into human cortical projection neurons. Protein coding genes were found to account for 54.8 and 57.0% of expressed...

Long noncoding RNAs: functional surprises from the RNA world.

Most of the eukaryotic genome is transcribed, yielding a complex network of transcripts that includes tens of thousands of long noncoding RNAs with little or no protein-coding capacity. Although the vast majority of long noncoding RNAs have yet to be characterized thoroughly, many of these transcripts are unlikely to represent transcriptional "noise" as a significant number have been shown to e...

Epigenetic: A missing paradigm in cellular and molecular pathways of sulfur mustard lung: a prospective and comparative study

Sulfur mustard (SM, bis-(2-chloroethyl) sulphide) is a chemical warfare agent that causes DNA alkylation, protein modification and membrane damage. SM can trigger several molecular pathways involved in inflammation and oxidative stress, which cause cell necrosis and apoptosis, and loss of cell integrity and function. Epigenetic regulation of gene expression is a growing research topic and is ...

Journal title:
  • Genome Research

Volume 22, Issue 9

Pages  -

Publication date: 2012